
    Dublin City University at QA@CLEF 2008

    We describe our participation in Multilingual Question Answering at CLEF 2008, using German and English as our source and target languages, respectively. The system was built using UIMA (Unstructured Information Management Architecture) as the underlying framework.

    A hybrid filtering approach for question answering

    We describe a question answering system that took part in the bilingual CLEF QA task (German-English), where German is the source language and English the target language. We used the Babel Fish online translation system to translate the German questions into English. The system is targeted at factoid and definition questions. Our focus in designing the current system is on testing our online methods, which are based on information extraction and linguistic filtering. Our system does not make use of precompiled tables or gazetteers but uses Web snippets to rerank candidate answers extracted from the document collections. WordNet is also used as a lexical resource in the system. Our question answering system consists of the following core components: Question Analysis, Passage Retrieval, Sentence Analysis and Answer Selection. These components employ various Natural Language Processing (NLP) and Machine Learning (ML) tools, a set of heuristics and different lexical resources. Seamless integration of the various components is one of the major challenges of QA system development. To facilitate our development process, we used the Unstructured Information Management Architecture (UIMA) as our underlying framework.
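
    The snippet-based reranking step mentioned above can be illustrated with a minimal sketch. The abstract does not say how snippet evidence is combined with the original candidate scores, so the function name `rerank_by_snippets`, the linear interpolation, and the `weight` parameter are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def rerank_by_snippets(candidates, snippets, weight=0.5):
    """Rerank (answer, base_score) pairs by how many snippets mention the answer.

    Hypothetical scoring: a linear mix of the base score and the fraction of
    snippets containing the answer string (case-insensitive substring match).
    """
    lowered = [s.lower() for s in snippets]
    support = Counter()
    for answer, _ in candidates:
        key = answer.lower()
        support[answer] = sum(1 for s in lowered if key in s)

    total = max(len(snippets), 1)  # avoid division by zero with no snippets
    rescored = [
        (answer, (1 - weight) * base + weight * support[answer] / total)
        for answer, base in candidates
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

    With `weight=0`, the ranking falls back to the scores from the document collection alone, which makes the contribution of the Web evidence easy to isolate.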

    Part of Speech tagging for Amharic using Conditional Random Fields

    We applied Conditional Random Fields (CRFs) to the tasks of Amharic word segmentation and POS tagging using a small annotated corpus of 1000 words. Given the size of the data and the large number of unknown words in the test corpus (80%), an accuracy of 84% for Amharic word segmentation and 74% for POS tagging is encouraging, indicating the applicability of CRFs for a morphologically complex language like Amharic.

    Estimating Importance Features for Fact Mining (With a Case Study in Biography Mining)

    We present a transparent model for ranking sentences that incorporates topic relevance as well as an aboutness and importance feature. We describe and compare five methods for estimating the importance feature. The two key features that we use are graph-based ranking and ranking based on reference corpora of sentences known to be important. Independently those features do not improve over the baseline, but combined they do. While our experimental evaluation focuses on informational queries about people, our importance estimation methods are completely general and can be applied to any topic.

    Feature Engineering and Post-Processing for Temporal Expression Recognition Using Conditional Random Fields

    We present the results of feature engineering and post-processing experiments conducted on a temporal expression recognition task. The former explores the use of different kinds of tagging schemes and of exploiting a list of core temporal expressions during training. The latter is concerned with the use of this list for post-processing the output of a system based on conditional random fields. We find that the incorporation of knowledge sources both for training and post-processing improves recall, while the use of extended tagging schemes may help to offset the (mildly) negative impact on precision. Each of these approaches addresses a different aspect of the overall recognition performance. Taken separately, the impact on the overall performance is low, but by combining the approaches we achieve both high precision and high recall scores.
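
    The list-based post-processing idea can be sketched as follows. The abstract does not specify the tagging scheme or the matching rule, so this sketch assumes plain BIO tags with a hypothetical `TIMEX` label and a longest-match lookup that only fills spans the tagger left entirely untagged; `apply_lexicon` is an illustrative name, not part of the described system.

```python
def apply_lexicon(tokens, tags, lexicon):
    """Add BIO tags for lexicon matches that the tagger missed.

    tokens:  list of token strings
    tags:    list of BIO tags from the tagger ("B-TIMEX", "I-TIMEX", "O")
    lexicon: set of tuples of lowercased tokens (assumed core expressions)
    Existing tagger decisions are never overwritten.
    """
    fixed = list(tags)
    lowered = [t.lower() for t in tokens]
    max_len = max((len(entry) for entry in lexicon), default=0)

    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span_is_free = all(t == "O" for t in fixed[i:i + n])
            if span_is_free and tuple(lowered[i:i + n]) in lexicon:
                matched = n
                break
        if matched:
            fixed[i] = "B-TIMEX"
            for j in range(i + 1, i + matched):
                fixed[j] = "I-TIMEX"
            i += matched
        else:
            i += 1
    return fixed
```

    Because the lookup only adds tags, it can raise recall but may also introduce false positives, which is consistent with the mildly negative effect on precision reported above.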